An Intelligent Search Infrastructure for Language Resources on the Web

Authors

  • Timothy Baldwin
  • Steven Bird
  • Baden Hughes
Abstract

Language occupies a central role on the web: most content is expressed in a given language, and most access takes place via natural language input and interfaces. Today, investigation of human language in all its forms depends on access to this vast store of language data. In particular, linguists and language technologists annotate and analyze this data and develop new language resources including grammars, dictionaries, and a raft of new technologies for automatic translation, information extraction, question answering, and so forth. As this new documentation is disseminated on the web, and as the new technologies are in turn deployed on the web, a further round of collection and processing is enabled, closing the loop. For instance, a collection of Japanese text with an aligned English translation can be used for translation studies, for adding examples to bilingual dictionaries, and for developing translation systems. These resources can then be used for new purposes, e.g. to provide English speakers with access to content stored in Japanese text, or to provide Japanese learners of English with more authentic example sentences.

Similar resources

A Large-Scale Web Data Collection as a Natural Language Processing Infrastructure

In recent years, language resources acquired from the Web have been released, and these data improve the performance of applications in several NLP tasks. Although language resources based on the web page unit are useful in NLP tasks and applications such as knowledge acquisition, document retrieval and document summarization, such language resources have not been released so far. In this paper, we prop...

The state-of-the-art in web-scale semantic information processing for cloud computing

Based on an integrated infrastructure for resource sharing and computing in a distributed environment, cloud computing involves the provision of dynamically scalable and virtualized resources as services over the Internet. These applications also bring large-scale heterogeneous and distributed information, which poses a great challenge in terms of semantic ambiguity. It is critical for a...

Intelligent Health Solution System

Introduction: In the field of management, the statistics and performance of the deputies and functions of the organization are always of great importance. This requires instant access to the latest status of the system under coverage and at least a minimal forecast of the future situation, in order to provide and improve quality services. All of this justifies the existence of an intelligent statistical system w...

Expert Knowledge Management based on Ontology in a Digital Library

The architecture of future Digital Libraries should allow any user to access available knowledge resources from anywhere, at any time, and in an efficient manner. Moreover, for the individual user, there is a great deal of useless information in addition to the substantial amount of useful information. The goal is to investigate how to best combine Artificial Intelligence and Semantic ...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

Publication date: 2006